61 research outputs found
Learning Structured Priors with Optimization-based Modeling
Underpinning the success of deep learning are effective structural prior modeling schemes that allow a broad range of domain-specific knowledge in data to be naturally encoded in a deep learning architecture. For example, in the computer vision community, convolutional neural networks implicitly encode transformation invariances (e.g., rotation and translation) by learning shareable weights across the spatial domain of images. For sequential data, such as natural language sentences and speech utterances, recurrent neural networks are another class of architectures that perceive sequential order and capture the dependence among inputs. Besides advanced network architectures, one of the most prevalent approaches to incorporating structural priors is regularization, which usually results in a complex non-convex optimization problem and creates contention between performance on end tasks and faithfulness to the regularization.
We argue in this thesis that optimization methods provide an expressive set of primitive operations that allow us to integrate structural priors into the modeling pipeline without interfering with the learning of end tasks. We first propose inserting a proximal mapping as a hidden layer in a deep neural network, which directly and explicitly produces well-regularized hidden layer outputs. The resulting technique is shown to be closely connected to kernel warping and dropout, and novel algorithms are developed for robust temporal learning and multiview learning. Next, we extend our framework to learn well-regularized functions that project given inputs to structured outputs. As an instantiation of this approach, we address an unsupervised domain adaptation problem in which the minimax game makes the training process unstable. A bi-level optimization based approach is proposed to decouple the minimax optimization so that the model enjoys a much more principled and efficient training procedure. In addition, our method warps probability discrepancy measures towards the end tasks by leveraging the pseudo-labels produced by the optimal predictor.
We validate our proposed methods through extensive experiments including image classification, speech recognition, cross-lingual word embedding, and domain adaptation. Our methods demonstrate a number of benefits over baseline methods, achieving state-of-the-art performance in various supervised and unsupervised learning tasks.
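As a rough illustration of the idea of using a proximal mapping as a hidden layer, the sketch below applies the proximal mapping of the L1 norm (soft-thresholding) to the output of a linear layer. The choice of the L1 regularizer, the function names `soft_threshold`, `linear`, and `forward`, and the parameter `lam` are assumptions made for this example; the abstract does not state which regularizers or layer designs the thesis actually uses.

```python
from typing import List, Sequence


def soft_threshold(z: Sequence[float], lam: float) -> List[float]:
    """Proximal mapping of lam * ||x||_1 evaluated at z.

    Solves argmin_x 0.5 * ||x - z||^2 + lam * ||x||_1 in closed form:
    each coordinate is shrunk towards zero by lam and clipped at zero.
    """
    out = []
    for v in z:
        mag = abs(v) - lam
        if mag <= 0.0:
            out.append(0.0)
        else:
            out.append(mag if v > 0 else -mag)
    return out


def linear(x: Sequence[float], W: Sequence[Sequence[float]],
           b: Sequence[float]) -> List[float]:
    """Plain affine map W x + b (hypothetical helper for this sketch)."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]


def forward(x: Sequence[float], W: Sequence[Sequence[float]],
            b: Sequence[float], lam: float) -> List[float]:
    """Hidden layer: affine map followed by the proximal mapping.

    The proximal step takes the place of a pointwise nonlinearity and
    returns a sparse (L1-regularized) hidden representation.
    """
    return soft_threshold(linear(x, W, b), lam)


if __name__ == "__main__":
    W = [[1.0, 1.0], [0.5, -0.5]]
    b = [0.0, 0.0]
    print(forward([1.0, 2.0], W, b, lam=1.0))
```

With a different regularizer the closed form changes, and in general the proximal mapping may itself require an inner optimization loop.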
Novel Online Dimensionality Reduction Method with Improved Topology Representing and Radial Basis Function Networks
This paper presents improvements to the conventional Topology Representing Network to build more appropriate topology relationships. Based on this improved Topology Representing Network, we propose a novel method for online dimensionality reduction that integrates the improved Topology Representing Network and a Radial Basis Function Network. This method can find meaningful low-dimensional feature structures embedded in the high-dimensional original data space, process nonlinear embedded manifolds, and map new data online. Furthermore, thanks to the improved Topology Representing Network, this method can deal with large datasets. Experiments illustrate the effectiveness of the proposed method.
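To make the "map new data online" step concrete, here is a minimal sketch of how a radial basis function model can place an unseen high-dimensional point into a low-dimensional space, given codebook vectors and their low-dimensional coordinates. It uses normalized Gaussian RBF interpolation, which is a simplification chosen for this example and not the training procedure of the paper, which the abstract does not describe. The names `rbf_map`, `centers`, `embeddings`, and `sigma` are hypothetical.

```python
import math
from typing import List, Sequence


def rbf_map(x: Sequence[float],
            centers: Sequence[Sequence[float]],
            embeddings: Sequence[Sequence[float]],
            sigma: float = 1.0) -> List[float]:
    """Map a new point x to low-dimensional coordinates.

    centers:    codebook vectors in the original high-dimensional space
    embeddings: low-dimensional coordinates already assigned to each center
    sigma:      width of the Gaussian basis functions (assumed shared)
    """
    sq_dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
    weights = [math.exp(-d / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    if total == 0.0:
        # All basis functions underflowed: fall back to the nearest center.
        nearest = min(range(len(centers)), key=lambda i: sq_dists[i])
        return list(embeddings[nearest])
    dim = len(embeddings[0])
    return [sum(w * e[k] for w, e in zip(weights, embeddings)) / total
            for k in range(dim)]


if __name__ == "__main__":
    centers = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
    embeddings = [[0.0], [1.0]]
    # A point halfway between the two centers lands halfway between
    # their low-dimensional coordinates.
    print(rbf_map([1.0, 0.0, 0.0], centers, embeddings))
```

Because each new point only needs distances to the codebook vectors, the cost per point grows with the number of codebook vectors and not with the size of the training set.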
Asymmetric Diels–Alder Reaction of α,β-Unsaturated Oxazolidin-2-one Derivatives Catalyzed by a Chiral Fe(III)-Bipyridine Diol Complex
An asymmetric Fe(III)-bipyridine diol catalyzed Diels–Alder reaction of α,β-unsaturated oxazolidin-2-ones has been developed. Among various Fe(II)/Fe(III) salts, Fe(ClO₄)₃·6H₂O was selected as the Lewis acid of choice. The use of a low catalyst loading (2 mol % of Fe(ClO₄)₃·6H₂O and 2.4 mol % of Bolm’s ligand) afforded high yields (up to 99%) and high enantiomeric excesses (up to 98%) of endo-cycloadducts for the Diels–Alder reaction between cyclopentadiene and substituted acryloyloxazolidin-2-ones. Other noncyclic dienes led to decreased enantioselectivities. A proposed model supports the observed stereoinduction.
Connecting the subgraphs in ITRN step 10.
The dataset consists of randomly generated nodes forming five non-overlapping clusters (S1 Dataset: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0131631#pone.0131631.s001). Black dots indicate the training patterns (500 nodes), and blue circles indicate the codebook vectors (100 vectors). The blue solid lines are established by ITRN steps 1–9, and the dotted lines are established by ITRN step 10.
Comparison of TRN and ITRN.
Black dots indicate the training patterns, and blue circles indicate codebook vectors. In the first experiment, 20 randomly generated training patterns (S1 Dataset: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0131631#pone.0131631.s001) and 10 codebooks were selected, and (a) and (b) show the results generated by TRN and ITRN, respectively. In the second experiment, 100 randomly generated training patterns (S1 Dataset) and 25 codebooks were selected, and (c) and (d) show the results generated by TRN and ITRN, respectively.
Values of quality metrics for ITRN-RBF and classical dimensionality reduction methods.
- …